
    Insights into the feature selection problem using local optima networks

    The binary feature selection problem is investigated in this paper. A fitness landscape analysis of feature selection is carried out, which allows for a better understanding of the behaviour of feature selection algorithms. Local optima networks are employed as a tool to visualise and characterise the fitness landscapes of the feature selection problem in the context of classification. An analysis of the global structure of the fitness landscape is provided, based on seven real-world datasets with up to 17 features. The formation of neutral global optima plateaus is shown to indicate the existence of irrelevant features in the datasets. Removal of irrelevant features reduced both the neutrality and the ratio of local optima to the size of the search space, which improved the performance of genetic algorithm search in finding the global optimum.
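    The link between irrelevant features and neutral plateaus of optima can be illustrated with a minimal sketch. It is not the paper's method: the one-bit-flip neighbourhood, the function names, and `toy_fitness` (which simply counts two "useful" features and ignores a third) are all assumptions made for illustration, whereas the paper evaluates classifiers on real datasets.

```python
from itertools import product


def neighbours(mask):
    """Yield every feature subset that is one bit flip away from `mask`."""
    for i in range(len(mask)):
        yield mask[:i] + (1 - mask[i],) + mask[i + 1:]


def local_optima(fitness, n_features):
    """Return all subsets that have no strictly better one-bit-flip neighbour."""
    optima = []
    for mask in product((0, 1), repeat=n_features):
        if all(fitness(nb) <= fitness(mask) for nb in neighbours(mask)):
            optima.append(mask)
    return optima


def toy_fitness(mask):
    # Hypothetical fitness: features 0 and 1 help, feature 2 is irrelevant.
    return mask[0] + mask[1]


# With the irrelevant feature present, two equally fit optima form a plateau.
with_irrelevant = local_optima(toy_fitness, 3)
# After dropping the irrelevant feature, a single optimum remains.
without_irrelevant = local_optima(toy_fitness, 2)
print(with_irrelevant, without_irrelevant)
```

    In this toy setting the irrelevant feature doubles the number of optima without changing their fitness, which is the kind of neutrality the abstract associates with irrelevant features.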

    Applications of Nature-Inspired Algorithms for Dimension Reduction: Enabling Efficient Data Analytics

    In [1], we explored the theoretical aspects of feature selection and evolutionary algorithms. In this chapter, we focus on optimization algorithms for enhancing the data analytics process, i.e., we explore applications of nature-inspired algorithms in data science. Feature selection optimization is a hybrid approach that combines feature selection techniques with evolutionary algorithms to optimize the selected features. Prior works solve this problem iteratively to converge to an optimal feature subset. Feature selection optimization is not specific to any domain. Data scientists mainly seek advanced ways to analyze data with high computational efficiency and low time complexity, leading to efficient data analytics. As the volume of generated, measured and sensed data from various sources increases, the effort required to analyze, manipulate and illustrate the data grows exponentially. Because of these large-scale data sets, the curse of dimensionality (CoD) is one of the NP-hard problems in data science. Hence, several efforts have focused on leveraging evolutionary algorithms (EAs) to address the complex issues in large-scale data analytics problems. Dimension reduction, together with EAs, lends itself to addressing CoD and to solving complex problems efficiently in terms of time complexity. In this chapter, we first provide a brief overview of previous studies that focused on solving CoD using a feature extraction optimization process. We then discuss practical examples of research studies that have successfully tackled application domains such as image processing, sentiment analysis, network traffic and anomaly analysis, credit score analysis, and other benchmark functions and data sets.

    Evolutionary Computation, Optimization and Learning Algorithms for Data Science

    A large number of engineering, science and computational problems have yet to be solved in a computationally efficient way. One of the emerging challenges is how evolving technologies grow towards autonomy and intelligent decision making. This leads to the collection of large amounts of data from various sensing and measurement technologies, e.g., cameras, smart phones, health sensors, smart electricity meters, and environment sensors. Hence, it is imperative to develop efficient algorithms for the generation, analysis, classification, and illustration of data. Meanwhile, data is structured purposefully through different representations, such as large-scale networks and graphs. We focus on data science as a crucial area, and specifically on the curse of dimensionality (CoD), which arises from the large amount of generated, sensed and collected data. This motivates researchers to think about optimization and to apply nature-inspired algorithms, such as evolutionary algorithms (EAs), to solve optimization problems. Although these algorithms appear non-deterministic, they are robust enough to reach an optimal solution. Researchers typically adopt evolutionary algorithms only when a problem is prone to becoming trapped in a local optimum rather than reaching the global optimum. In this chapter, we first develop a clear and formal definition of the CoD problem. Next, we focus on feature extraction techniques and categories. We then provide a general overview of meta-heuristic algorithms, their terminology, and the desirable properties of evolutionary algorithms.

    Differential evolution for filter feature selection based on information theory and feature ranking

    © 2017 Elsevier B.V. Feature selection is an essential step in various tasks, where filter feature selection algorithms are increasingly attractive due to their simplicity and speed. A common filter strategy is to use mutual information to estimate the relationships between each feature and the class labels (mutual relevancy), and between each pair of features (mutual redundancy). This strategy has gained popularity, resulting in a variety of criteria based on mutual information. Other well-known strategies are to rank each feature based on the nearest neighbor distance, as in ReliefF, and based on the between-class variance and the within-class variance, as in Fisher Score. However, each strategy comes with its own advantages and disadvantages. This paper proposes a new filter criterion inspired by the concepts of mutual information, ReliefF and Fisher Score. Instead of using mutual redundancy, the proposed criterion tries to choose the highest ranked features determined by ReliefF and Fisher Score while providing the mutual relevance between features and the class labels. Based on the proposed criterion, two new differential evolution (DE) based filter approaches are developed. While the former uses the proposed criterion as a single-objective problem in a weighted manner, the latter considers the proposed criterion in a multi-objective design. Moreover, a well-known mutual information feature selection approach (MIFS) based on maximum-relevance and minimum-redundancy is also adopted in single-objective and multi-objective DE algorithms for feature selection. The results show that the proposed criterion outperforms MIFS in both single-objective and multi-objective DE frameworks. The results also indicate that considering feature selection as a multi-objective problem can generally provide better performance in terms of the feature subset size and the classification accuracy.
© This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/by-nc-nd/4.0
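    As a rough illustration of one ingredient named in the abstract above, the sketch below computes a Fisher Score for a single feature as the ratio of between-class variance to within-class variance. It uses a common textbook form of the score and is not the paper's proposed criterion; the function name `fisher_score` and the toy data are assumptions made for this example.

```python
from collections import defaultdict


def fisher_score(values, labels):
    """Between-class variance over within-class variance for one feature."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        n = len(members)
        mean = sum(members) / n
        variance = sum((v - mean) ** 2 for v in members) / n
        between += n * (mean - overall_mean) ** 2
        within += n * variance

    # A feature that is constant within every class separates them perfectly.
    return between / within if within > 0 else float("inf")


# Hypothetical toy data: two classes, two samples each.
labels = [0, 0, 1, 1]
informative = fisher_score([1.0, 2.0, 5.0, 6.0], labels)
noisy = fisher_score([1.0, 5.0, 2.0, 6.0], labels)
print(informative, noisy)
```

    A filter method in this style would rank features by such a score and keep the highest ranked ones; how the paper combines it with ReliefF and mutual relevance is not specified in the abstract.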

    A binary ABC algorithm based on advanced similarity scheme for feature selection

    © 2015 Elsevier B.V. All rights reserved. Feature selection is the basic pre-processing task of eliminating irrelevant or redundant features by investigating complicated interactions among features in a feature set. Due to its critical role in classification and computational time, it has attracted researchers' attention for the last five decades. However, it still remains a challenge. This paper proposes a binary artificial bee colony (ABC) algorithm for feature selection problems, which is developed by integrating evolutionary-based similarity search mechanisms into an existing binary ABC variant. The performance of the proposed algorithm is demonstrated by comparing it with genetic algorithms and with some well-known variants of the particle swarm optimization (PSO) and ABC algorithms, including standard binary PSO, new velocity based binary PSO, quantum inspired binary PSO, discrete ABC, modification rate based ABC, and angle modulated ABC, on 10 benchmark datasets. The results show that the proposed algorithm can obtain higher classification performance on both training and test sets, and can eliminate irrelevant and redundant features more effectively than the other approaches. Note that all the algorithms used in this paper, except for standard binary PSO and GA, are employed for the first time in feature selection.

    Supplementary Material for: Processes of Emotion Idioms Comprehension of Turkish-Speaking People with Wernicke's Aphasia

    Introduction: Idioms are commonly used in everyday language to convey emotions figuratively. The ability to comprehend and use idioms that incorporate emotional elements is crucial for effective communication in daily life, particularly among people with aphasia (PwA). Despite the interest in understanding the process of emotion idiom comprehension in PwA, limited information is available in the literature. Therefore, this study aimed to investigate the process of emotion idiom comprehension in people with Wernicke's aphasia (PwWA) and compare it with that of neurotypical individuals. Methods: Sixty idioms were selected based on their syntactic and semantic features, and participants evaluated their imageability. Sixteen idioms were chosen for the study and two types of tasks were prepared: written idiom-picture matching and written idiom-written text matching. These tasks were administered to two groups: 11 PwWA and 11 neurotypical individuals. The results were analysed in terms of task performance, response type, syntactic and semantic features, and emotional content. Results: The emotion idiom comprehension scores of the PwWA group were significantly lower than those of the neurotypical participants. PwWA had greater difficulty with the written idiom-picture matching task and tended to rely on the literal meanings of the idioms. There were differences in the semantic features between the two groups. Among the emotional idioms, PwWA showed significant differences in the types of emotions they were able to comprehend. Conclusions: The findings of this study suggest that regardless of the syntactic content of idioms, PwWA's ability to comprehend emotion idioms is impaired, and they tend to interpret them more literally. This study provides a useful method for assessing emotional idiom comprehension in PwA.

    Angiotensin-converting enzyme gene polymorphism in arrhythmogenic right ventricular dysplasia: Is DD genotype helpful in predicting syncope risk?

    Introduction. Arrhythmogenic right ventricular dysplasia (ARVD) is a heritable disorder characterised by fibrofatty replacement of right ventricular myocytes and increased risk of ventricular arrhythmias and sudden cardiac death. Angiotensin-converting enzyme (ACE) gene insertion/deletion (I/D) polymorphism affects myocardial ACE levels. The DD genotype favours myocardial fibrosis and is associated with malignant ventricular tachycardia. The aim of this study was to explore ACE gene polymorphism in ARVD patients.